Fail-Aware Failure Detectors

Authors

  • Christof Fetzer
  • Flaviu Cristian
Abstract

In existing asynchronous distributed systems it is impossible to implement perfect failure detectors, i.e., detectors that suspect only crashed processes and eventually suspect all crashed processes. Some recent research has, however, proposed that any “reasonable” failure detector for solving the election problem must be perfect. We address this problem by introducing two new classes of fail-aware failure detectors that 1) are implementable in existing asynchronous distributed systems, 2) are not necessarily perfect, and 3) can be used to solve the election problem. In particular, we show that there exists a fail-aware failure detector that allows the election problem to be solved and that is strictly weaker than a Perfect failure detector.
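
The impossibility claim in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the `TimeoutDetector` class, the timeout value, and the process names `p` and `q` are hypothetical, and time is simulated with plain integers. The sketch assumes a heartbeat-and-timeout scheme and shows that, when message delays are unbounded, a slow but live process is suspected just like a crashed one, so such a detector cannot be perfect.

```python
class TimeoutDetector:
    """Hypothetical detector: suspects a process when no heartbeat
    has arrived from it for more than `timeout` time units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heartbeat = {}  # process id -> arrival time of latest heartbeat

    def heartbeat(self, pid, now):
        # Record that a heartbeat from `pid` arrived at time `now`.
        self.last_heartbeat[pid] = now

    def suspects(self, now):
        # Every process whose latest heartbeat is older than the timeout.
        return {pid for pid, t in self.last_heartbeat.items()
                if now - t > self.timeout}


detector = TimeoutDetector(timeout=5)
detector.heartbeat("p", now=0)
detector.heartbeat("q", now=0)

# Scenario: q crashes right after time 0. p stays alive, but its next
# heartbeat is delayed by the network and only arrives at time 9.
suspected_at_7 = detector.suspects(now=7)    # p is wrongly suspected here
detector.heartbeat("p", now=9)
suspected_at_10 = detector.suspects(now=10)  # the mistake about p is corrected
```

At time 7 the detector cannot distinguish the delayed process `p` from the crashed process `q`. Raising the timeout only moves the problem, since an asynchronous system places no bound on the delay.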

Similar resources

Computer Science and Artificial Intelligence Laboratory Impossibility of Boosting Distributed Service Resilience

We prove two theorems stating that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the give...

Survey on Scalable Failure Detectors

Maintaining a timely view of the current system status is essential to the performance and functionality of distributed systems. Failure detectors have long been essential to distributed systems. In this paper, we evaluate two failure detection algorithms specifically aimed at large-scale systems. Both assume fail-stop (non-Byzantine) models but the similarities end there. Dynamo’s failure dete...

Derivation of Fail-Aware Membership Service Specifications

We derive the specification of a primary partition and a partitionable fail-aware node membership service in a top-down fashion. The derived specifications are fail-aware in the sense that each client of a membership server can learn if the server currently provides its standard semantics or an exception semantics because too many failures have occurred. We first propose the specification of an idea...

Failure Detectors for Large-Scale Distributed Systems

This paper discusses the problem of implementing a scalable failure detection service for Grid systems. More specifically, traditional implementations of failure detectors are often tuned for running over local networks and fail to address some important problems found in wide-area distributed systems, such as Grid systems. We identify some of the most important problems raised in the context o...

IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Fail-Aware Untrusted Storage

We consider a set of clients collaborating through an online service provider that is subject to attacks, and hence not fully trusted by the clients. We introduce the abstraction of a fail-aware untrusted service, with meaningful semantics even when the provider is faulty. In the common case, when the provider is correct, such a service guarantees consistency (linearizability) and ...

Publication date: 1996